    Directionality of point mutation and 5-methylcytosine deamination rates in the chimpanzee genome

    Background: The pattern of point mutation is important for studying mutational mechanisms, genome evolution, and diseases. Previous studies of mutation direction were largely based on substitution data from a limited number of loci. To date, there has been no genome-wide analysis of mutation direction or methylation-dependent transition rates in the chimpanzee genome or its categorized genomic regions. Results: In this study, we performed a detailed examination of mutation direction in the chimpanzee genome and its categorized genomic regions using 588,918 SNPs whose ancestral alleles could be inferred by mapping them to human genome sequences. The C→T (G→A) changes occurred most frequently in the chimpanzee genome. Each type of transition occurred approximately four times more frequently than each type of transversion. Notably, the frequency of C→T (G→A) was the highest in exons among the genomic categories, regardless of whether we calculated it directly, normalized it by nucleotide content, or removed the SNPs involved in the CpG effect. Moreover, the directionality of point mutation in exons and CpG islands was opposite to that in their corresponding intergenic regions, indicating that different forces govern the nucleotide changes. Our analysis suggests that the GC content is not in equilibrium in the chimpanzee genome. Further quantitative analysis revealed that the 5-methylcytosine deamination rates at CpG sites were highly dependent on the local GC content and the lengths of SNP flanking sequences, and varied among categorized genomic regions. Conclusion: We present the first mutational spectrum, estimated by three different approaches, in the chimpanzee genome. Our results provide detailed information on recent nucleotide changes and methylation-dependent transition rates in the chimpanzee genome after its split from the human lineage. These results have important implications for understanding genome composition evolution, mechanisms of point mutation, and other genetic factors such as selection, biased codon usage, biased gene conversion, and recombination.
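
    The tallying of directed changes described in this abstract can be illustrated with a minimal sketch. It assumes each SNP is already available as an (ancestral, derived) allele pair; the function names and the toy input are hypothetical and are not taken from the paper. Complementary changes such as C→T and G→A are folded into one class, as the abstract's notation suggests.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
PURINES = {"A", "G"}


def fold_to_pyrimidine(ancestral: str, derived: str) -> str:
    """Fold a directed change so the ancestral base is C or T.

    G>A is reported as C>T, A>G as T>C, and so on (hypothetical helper).
    """
    if ancestral in PURINES:
        ancestral, derived = COMPLEMENT[ancestral], COMPLEMENT[derived]
    return f"{ancestral}>{derived}"


def is_transition(ancestral: str, derived: str) -> bool:
    """A transition keeps the base class (purine<->purine or pyrimidine<->pyrimidine)."""
    return (ancestral in PURINES) == (derived in PURINES)


def mutation_spectrum(snps):
    """Count folded directed changes and the transition/transversion totals."""
    classes = Counter()
    transitions = transversions = 0
    for ancestral, derived in snps:
        classes[fold_to_pyrimidine(ancestral, derived)] += 1
        if is_transition(ancestral, derived):
            transitions += 1
        else:
            transversions += 1
    return classes, transitions, transversions


# Toy input for illustration only; not data from the study.
toy_snps = [("C", "T"), ("G", "A"), ("A", "G"), ("C", "A"), ("T", "A")]
classes, n_ts, n_tv = mutation_spectrum(toy_snps)
print(dict(classes), n_ts, n_tv)
```

    The sketch counts raw frequencies only; normalizing by nucleotide content or excluding CpG-context SNPs, both mentioned in the abstract, would need the flanking sequence of each SNP.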

    Comparative Analysis of CpG Islands in Four Fish Genomes

    There has been much interest in CpG islands (CGIs), clusters of CpG dinucleotides in GC-rich regions, because they are considered gene markers and are involved in gene regulation. To date, there has been no genome-wide analysis of CGIs in fish genomes. We first evaluated the performance of three popular CGI identification algorithms in four fish genomes (tetraodon, stickleback, medaka, and zebrafish). Our results suggest that Takai and Jones' (2002) algorithm is the most suitable for comparative analysis of CGIs in fish genomes. We then performed a systematic analysis of CGIs in the four fish genomes using Takai and Jones' algorithm and compared the results with those from other vertebrate genomes. We found that both the number of CGIs and the CGI density vary greatly among these genomes. Remarkably, each fish genome presents a distinct distribution of CGI density with respect to some genomic factors (e.g., chromosome size and chromosome GC content). These findings are helpful for understanding the evolution of fish genomes and the features of fish CGIs.
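
    As a rough illustration of how a candidate region is scored against CGI criteria, the sketch below checks a single sequence against the thresholds commonly cited for Takai and Jones (2002): length of at least 500 bp, GC content of at least 55%, and an observed/expected CpG ratio of at least 0.65. The abstract does not list these values, so they are stated here as an assumption, and the function names are hypothetical. The full algorithm also involves window scanning and merging, which this sketch omits.

```python
def cpg_stats(seq: str):
    """Return (GC fraction, observed/expected CpG ratio) for one sequence."""
    seq = seq.upper()
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")
    gc_fraction = (c_count + g_count) / n if n else 0.0
    # Obs/Exp CpG = (#CpG * N) / (#C * #G); defined as 0 when C or G is absent.
    if c_count and g_count:
        obs_exp = (cpg_count * n) / (c_count * g_count)
    else:
        obs_exp = 0.0
    return gc_fraction, obs_exp


def meets_cgi_thresholds(seq: str, min_len: int = 500,
                         min_gc: float = 0.55,
                         min_obs_exp: float = 0.65) -> bool:
    """Check the three threshold criteria (defaults are assumed values)."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction >= min_gc and obs_exp >= min_obs_exp


# Synthetic sequences for illustration only.
print(meets_cgi_thresholds("CG" * 250))  # CpG-dense, 500 bp
print(meets_cgi_thresholds("AT" * 250))  # no C or G at all
```

    Keeping the thresholds as parameters makes it easy to compare stricter and looser definitions on the same sequence.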

    Features of Recent Codon Evolution: A Comparative Polymorphism-Fixation Study

    Features of amino-acid and codon changes can provide important insights into protein evolution. So far, investigators have often examined mutation patterns at either the interspecies fixed substitution level or the intraspecies nucleotide polymorphism level, but not both. Here, we performed a unique analysis of a combined set of intra-species polymorphisms and inter-species substitutions in human codons. Strong differences in mutational pattern were found at codon positions 1, 2, and 3 between the polymorphism and fixation data. Fixation had a strong bias towards increasing the rarest codons but decreasing the most frequently used codons, suggesting that codon equilibrium has not yet been reached. We detected a strong CpG effect on CG-containing codons and its subsequent suppression by fixation. Finally, we detected the signature of purifying selection against A∣U dinucleotides at synonymous dicodon boundaries. Overall, the fixation process could effectively and quickly correct the volatile changes introduced by polymorphisms, so that codon changes could be gradual and directional and codon composition could be kept relatively stable during evolution.

    A Convergent Study of Genetic Variants Associated With Crohn’s Disease: Evidence From GWAS, Gene Expression, Methylation, eQTL and TWAS

    Crohn’s Disease (CD) is one of the predominant forms of inflammatory bowel disease (IBD). A combination of genetic and non-genetic risk factors has been reported to contribute to the development of CD. Many high-throughput omics studies, such as genome-wide association studies (GWAS) and next generation sequencing studies, have been conducted to identify disease-associated risk variants that might contribute to CD. A pressing need remains to prioritize and characterize candidate genes that underlie the etiology of CD. In this study, we collected comprehensive multi-dimensional data from GWAS, gene expression, and methylation studies and generated transcriptome-wide association study (TWAS) data to further interpret the GWAS association results. We applied our previously developed method, mega-analysis of Odds Ratio (MegaOR), to prioritize CD candidate genes (CDgenes). As a result, we identified consensus sets of CDgenes (62–235 genes) based on the evidence matrix. We demonstrated that these CDgenes interacted with each other significantly more frequently than expected by chance. Functional annotation of these genes highlighted critical immune-related processes such as immune response, MHC class II receptor activity, and immunological disorders. In particular, the constitutive photomorphogenesis 9 (COP9) signalosome related genes were found to be significantly enriched in CDgenes, implying a potential role of the COP9 signalosome in the pathogenesis of CD. Finally, we found that some of the CDgenes shared biological functions with known drug targets of CD, such as the regulation of inflammatory response and leukocyte adhesion to vascular endothelial cells. In summary, we identified highly confident CDgenes from multi-dimensional evidence, providing insights for the understanding of CD etiology.

    The Potential Roles of Long Noncoding RNAs (lncRNA) in Glioblastoma Development

    Long noncoding RNAs (lncRNAs) may contribute to the initiation and progression of tumors. In this study, we first systematically compared lncRNA and mRNA expression between glioblastoma and paired normal brain tissues using microarray data. We found 27 lncRNAs and 82 mRNAs significantly upregulated in glioblastoma, as well as 198 lncRNAs and 285 mRNAs significantly downregulated in glioblastoma. We identified 138 coexpressed lncRNA–mRNA pairs from these differentially expressed lncRNAs and genes. Subsequent pathway analysis of the lncRNA-paired genes indicated that EphrinB–EPHB, p75-mediated signaling, TNFα/NF-κB, and ErbB2/ErbB3 signaling pathways might be altered in glioblastoma. Specifically, lncRNA RAMP2-AS1 showed significantly decreased expression in glioblastoma tissues and a coexpression relationship with NOTCH3, an important tumor promoter in many neoplastic diseases. Our follow-up experiments indicated that (i) overexpression of RAMP2-AS1 reduced glioblastoma cell proliferation in vitro and also reduced glioblastoma xenograft tumors in vivo; (ii) NOTCH3 and RAMP2-AS1 coexpression rescued the inhibitory action of RAMP2-AS1 in glioblastoma cells; and (iii) an RNA pull-down assay revealed a direct interaction of RAMP2-AS1 with DHC10, which may consequently inhibit, as we hypothesize, the expression of NOTCH3 and its downstream signaling molecule HES1 in glioblastoma. Taken together, our data revealed that the lncRNA expression profile in glioblastoma tissue was significantly altered, and that RAMP2-AS1 might play a tumor suppressive role in glioblastoma through an indirect inhibition of NOTCH3. Our results provide some insights into the key roles of lncRNA–mRNA coregulation in human glioblastoma and the mechanisms responsible for glioblastoma progression and pathogenesis. Mol Cancer Ther; 15(12); 2977–86. ©2016 AACR

    Network analysis of gene fusions in human cancer

    A novel statistical method to estimate the effective SNP size in vertebrate genomes and categorized genomic regions

    Background: The local environment of single nucleotide polymorphisms (SNPs) contains abundant genetic information for the study of mechanisms of mutation, genome evolution, and causes of diseases. Recent studies revealed that neighboring-nucleotide biases on SNPs were strong and that the genome-wide bias patterns could be represented by a small subset of the total SNPs. Estimating the effective SNP size, that is, the number of SNPs sufficient to represent the bias patterns observed in the whole SNP data, remains an unsolved problem. Results: To estimate the effective SNP size, we developed a novel statistical method, SNPKS, which considers both statistical and biological significance. SNPKS consists of two major steps: obtaining an initial effective size by the Kolmogorov-Smirnov test (KS test) and finding an intermediate effective size by interval evaluation. The SNPKS algorithm was implemented in computer programs and applied to real SNP data. The effective SNP size was estimated to be 38,200, 39,300, 38,000, and 38,700 in the human, chimpanzee, dog, and mouse genomes, respectively, and 39,100, 39,600, 39,200, and 42,200 in human intergenic, genic, intronic, and CpG island regions, respectively. Conclusion: SNPKS is the first statistical method to estimate the effective SNP size. It runs efficiently and greatly outperforms the algorithm implemented in SNPNB. The application of SNPKS to real SNP data revealed similarly small effective SNP sizes (38,000 – 42,200) in the human, chimpanzee, dog, and mouse genomes as well as in human genomic regions. The findings suggest a strong influence of genetic factors across vertebrate genomes.
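
    The abstract names the Kolmogorov-Smirnov test as the first step of SNPKS but does not say which distributions are compared or how the subset size is chosen. The sketch below therefore illustrates only the generic two-sample KS statistic (the largest gap between two empirical cumulative distribution functions), not the SNPKS procedure itself; the function name and sample values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b) -> float:
    """Two-sample KS statistic: max |ECDF_a(x) - ECDF_b(x)| over all observed x."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    largest_gap = 0.0
    # The ECDFs only change at observed values, so checking those is sufficient.
    for x in a + b:
        ecdf_a = bisect_right(a, x) / len(a)
        ecdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(ecdf_a - ecdf_b))
    return largest_gap


# Made-up values for illustration only.
print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))
```

    A small statistic means the two samples are distributed similarly, which is the general idea behind asking whether a subset represents the whole; turning the statistic into a p-value would need a statistics library or the KS distribution, which this sketch leaves out.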

    Functional Analysis of Single Nucleotide Polymorphisms Associated with Type 2 Diabetes

    Type 2 diabetes (T2D), a metabolic disorder characterized by insulin resistance and relative insulin deficiency, is a life-long, common, complex disease of major public health importance. To date, 86 published studies have reported 639 associations between single nucleotide polymorphisms (SNPs) and T2D in the GWAS Catalog database, in addition to other studies in the literature. However, the majority (~93%) of the SNPs emerging from these studies are located within noncoding sequence, complicating their functional evaluation. Recently, several lines of evidence have suggested the involvement of a proportion of such variants in transcriptional regulatory mechanisms, including modulation of promoter and enhancer elements and enrichment within expression quantitative trait loci (eQTL). In this study, we downloaded T2D-associated SNPs from GWASdb, a derived database that includes the data from the GWAS Catalog. We then annotated them with transcription factor (TF) motif, promoter/enhancer, and eQTL information, followed by the construction of a TF-target network module, to better detect the underlying mechanisms of genetic variants involved in T2D. We found that T2D-associated SNPs were significantly enriched with functional information. In addition, we found that functional annotations could significantly improve the power of detecting causal variants and understanding their pathogenesis. Using the data collected from the Genotype-Tissue Expression (GTEx) project, we could further find the target genes for those eQTL SNPs. When cross-referencing with the DrugBank database, we were able to discover certain drugs that might regulate the expression of these genes and act against T2D.